Multidimensional cyclic graph approach: Representing a data cube without common sub-graphs
Abstract
We present a new full cube computation technique and cube storage representation, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity, so its materialization requires both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes, without a loss of generality, thus becomes a fundamental problem. Previous approaches, such as Dwarf, Star and MDAG, have substantially reduced the cube size using graph representations. In general, they eliminate prefix redundancy and some suffix redundancy from a data cube. The MCG differs significantly from these approaches in that it completely eliminates both prefix and suffix redundancies from a data cube. A data cube can be viewed as a set of sub-graphs. Redundant sub-graphs are quite common in a data cube, but eliminating them is a hard problem. The Dwarf, Star and MDAG approaches eliminate only some specific common sub-graphs. The MCG approach efficiently eliminates all common sub-graphs from the entire cube, based on an exact sub-graph matching solution. We propose a matching function that guarantees a one-to-one mapping between sub-graphs. The function is computed incrementally, in a top-down fashion, and its computation uses a minimal amount of information to generate unique results. In addition, it can be computed for any type of measure: distributive, algebraic or holistic. Performance analysis demonstrates that MCG is 20–40% faster than the Dwarf, Star and MDAG approaches when computing sparse data cubes. Dense data cubes have a small number of aggregations, which leaves little room for optimizing runtime and memory consumption, so the MCG approach is not useful for computing such cubes. The compact representation of sparse data cubes enables the MCG approach to reduce memory consumption by 70–90% compared to the original Star approach, proposed in [33]. In the same scenarios, the improved Star approach, proposed in [34], reduces memory consumption by only 10–30%, Dwarf by 30–50% and MDAG by 40–60%, compared to the original Star approach. The MCG is the first approach that uses an exact sub-graph matching function to reduce cube size, avoiding unnecessary aggregation, i.e. improving cube computation.
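The idea of storing each common sub-graph only once can be illustrated with a small Python sketch. This is not the MCG algorithm or its matching function, which the abstract does not specify. It is a plain hash-consing pass over a naively built full cube, assuming a SUM measure. The names `build`, `share` and `count`, and the sample rows, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: hash-consing stands in for the paper's
# (unspecified) exact sub-graph matching function.

def build(rows, level, num_dims):
    """Build a full cube as a prefix tree; '*' is the ALL value of a dimension."""
    if level == num_dims:
        return sum(measure for _, measure in rows)  # leaf: SUM aggregate
    groups = {}
    for dims, measure in rows:
        groups.setdefault(dims[level], []).append((dims, measure))
    node = {value: build(group, level + 1, num_dims)
            for value, group in groups.items()}
    node["*"] = build(rows, level + 1, num_dims)  # aggregate over this dimension
    return node

def count(node):
    """Number of nodes (including leaves) in the unshared tree."""
    if not isinstance(node, dict):
        return 1
    return 1 + sum(count(child) for child in node.values())

def share(node, table):
    """Assign one id per distinct sub-graph; identical sub-graphs get the same id."""
    if isinstance(node, dict):
        children = tuple(sorted((value, share(child, table))
                                for value, child in node.items()))
        key = ("node", children)
    else:
        key = ("leaf", node)
    return table.setdefault(key, len(table))

# Hypothetical sparse fact table: (dimension values, measure).
rows = [
    (("a1", "b1", "c1"), 5),
    (("a1", "b2", "c1"), 3),
    (("a2", "b1", "c2"), 4),
]

tree = build(rows, 0, 3)
table = {}
root_id = share(tree, table)

print("nodes without sharing:", count(tree))
print("distinct sub-graphs:  ", len(table))
print("SUM for (a1, *, c1):  ", tree["a1"]["*"]["c1"])
```

In this toy cube, the sub-graphs under `a2 -> b1` and `a2 -> *` are identical because `a2` occurs in a single row. This is the kind of suffix redundancy that is frequent in sparse data. The sketch builds the whole tree first and deduplicates afterwards, so it saves storage but not computation. The abstract, by contrast, describes a function computed incrementally and top-down, which is what lets MCG avoid unnecessary aggregation.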
Similar resources
Computing Data Cubes without Redundant Aggregated Nodes and Single Graph Paths: The Sequential MCG Approach
In this paper, we present a novel full cube computation and representation approach, named MCG. A data cube can be defined as a lattice of cuboids. In our approach, each cuboid is seen as a set of sub-graphs. Redundant suffixed nodes in such sub-graphs are quite common, but their elimination is a hard problem, as some previous cube approaches demonstrate. The MCG approach computes a data cube in two...
New Approach of Computing Data Cubes in Data Warehousing
The paper deals with data cubes built for data warehouses for OLAP purposes. An OLAP (Online Analytical Processing) system offers multidimensional data analysis in which a large volume of historically collected data is computed. To decrease query time and to provide various options to analysts, a multidimensional data model was designed to organize the data. In OLAP...
A Framework for Building OLAP Cubes on Graphs
Graphs are widespread structures providing a powerful abstraction for modeling networked data. Large and complex graphs have emerged in various domains such as social networks, bioinformatics, and chemical data. However, current warehousing frameworks are not equipped to handle efficiently the multidimensional modeling and analysis of complex graph data. In this paper, we propose a novel framew...
Warehousing RDF Graphs
Research in data warehousing (DW) has developed expressive and efficient tools for the multidimensional analysis of large amounts of data. As more data gets produced and shared in RDF, analytic concepts and tools for analyzing such irregular, graph-shaped, semantic-rich data need to be revisited. We introduce the first all-RDF model for warehousing RDF graphs. Notably, we define analytical sche...
On-Line Analytical Processing on Graphs Generated from Social Network Data
Social Network services have quickly become a powerful means by which people share real-time messages. Typically, social networks are modeled as large underlying graphs. Responding to this emerging trend, it becomes critically important to interactively view and analyze this massive amount of data from different perspectives and with multiple granularities. While Online analytical processing (O...
Journal title:
- Inf. Sci.
Volume 181
Publication date: 2011